41 research outputs found
Transportation data analysis. Advances in data mining and uncertainty treatment
2010/2011
In the study of transportation systems, the collection and use of correct and up-to-date information on the state of the system are central to producing reliable and adequate analyses. Unfortunately, in many application fields the available information is scarce or of low quality, and its use produces results affected by high uncertainty and, in some cases, doubtful validity.
Technological evolution in fields such as information technology, electronics and telecommunications is making it easier and less expensive to collect large amounts of data that can be used in transportation analyses. These data include traditional information gathered in transportation studies (e.g. traffic volumes on a given road section) and new kinds of data not directly connected to transportation needs (e.g. Bluetooth and GPS signals from mobile phones).
However, in many cases, particularly the latter, this large amount of data cannot be directly applied to transportation problems. The data are often of low quality and non-homogeneous, and they require time-consuming verification and validation before they can be used. Data Mining techniques can be an effective solution in these contexts, since they are designed to manage large amounts of data and produce results whose robustness increases with the size of the available dataset.
Based on these considerations, this thesis first presents a review of the most well-established Data Mining techniques, identifying the application contexts in the transportation field in which they can be useful analysis tools. Data Mining can be defined as the process of exploration and analysis that aims to discover meaningful patterns and "hidden" rules in the data under analysis. It can be considered one step of a more general Knowledge Discovery in Databases process, which begins with the selection, pre-processing and transformation of data ("mined" data are generally collected for purposes other than the analysis) and ends with the interpretation and evaluation of results. A generally accepted classification scheme identifies six categories of Data Mining techniques, according to the objective of the analysis: estimation (neural networks, regression models, decision trees), prediction (neural networks, decision trees), classification (k-nearest neighbor, decision trees, neural networks), clustering (clustering techniques, Self-Organizing Maps), affinity grouping or association (association rules) and description or profiling (association rules, clustering, decision trees). In reviewing the wide literature on Data Mining methods, particular attention has been devoted to the well-established techniques of clustering, classification and association, since they are the most widely applied across application contexts.
The literature review was then extended to Data Mining applications in the transportation field. This review highlights the great flexibility of these techniques and their growing number of applications. Many research topics have benefited from these innovative tools, and some of the most interesting are presented: short-term traffic flow forecasting from historical and real-time data, identification and quantification of risk factors in accident analysis, and the analysis of pavement management systems and traffic monitoring systems.
The second part of the thesis focuses on the application of Data Mining techniques to road system analysis, through a critical review of the U.S. Federal Highway Administration (FHWA) traffic monitoring procedure. This topic was chosen because the collection of traffic volume data is a relevant part of highway planning, as a significant component of the knowledge process. However, data collection generates considerable management costs, for both equipment and personnel, so monitoring programs must be designed carefully to obtain the maximum quality of results.
In the United States, the FHWA provides guidance for improving these aspects through its Traffic Monitoring Guide (TMG) (FHWA, 2001), which serves as a reference for similar agencies elsewhere in the world, including Italy. The FHWA procedure is based on two types of counts (short-duration counts taken with portable traffic counters and continuous counts taken with fixed counters), and its main objective is to estimate the Annual Average Daily Traffic (AADT).
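The way continuous and short-duration counts are combined can be illustrated with a minimal sketch of a factor approach. This is a simplification and not the TMG procedure itself: the monthly granularity, the approximation of AADT as the mean of monthly averages, and all function names and figures below are assumptions made for illustration.

```python
from statistics import mean


def seasonal_factors(monthly_adt):
    """Monthly seasonal adjustment factors for one continuous-count site.

    monthly_adt maps a month to the average daily traffic observed in it.
    Assumption: AADT is approximated as the mean of the monthly averages.
    """
    aadt = mean(monthly_adt.values())
    return {month: aadt / adt for month, adt in monthly_adt.items()}


def group_factors(sites):
    """Average the factors of all continuous-count sites in one road group."""
    per_site = [seasonal_factors(site) for site in sites]
    return {month: mean(f[month] for f in per_site) for month in per_site[0]}


def estimate_aadt(short_count_adt, month, factors):
    """Scale a short-duration count by the factor of the month it was taken in."""
    return short_count_adt * factors[month]


# Hypothetical road group with two continuous-count sites and two months of data.
group = [{"jan": 800, "jul": 1200}, {"jan": 1600, "jul": 2400}]
factors = group_factors(group)
aadt_estimate = estimate_aadt(600, "jul", factors)
```

The estimate depends entirely on which road group supplies the factors, which is why the definition of the groups and the assignment of short counts to them are the critical steps.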
A critical review of the literature on this topic shows that the weakest aspects of this procedure are the definition of road groups based on traffic flow patterns and the assignment of a road section to a road group using short counts. The proposed approach has been designed to address both issues.
The first issue concerns situations in which a road section could belong to more than one road group and the groups cannot be easily defined in transportation terms (e.g. "commuter road", "recreational road"). The proposed approach introduces Fuzzy Clustering techniques, which provide an analytical framework consistent with this kind of uncertainty. Concerning the second issue, road sections monitored with short counts are assigned to the road group with the most similar traffic patterns. In the proposed approach an Artificial Neural Network is used to assign short counts to road groups. The network is specifically designed to preserve the uncertainty related to the definition of road groups until the end of the estimation process: its output is the set of probabilities that a specific short count belongs to each road group. These probabilities are interpreted using Dempster-Shafer theory; measures of uncertainty related to the output (indices of non-specificity and discord) provide a concise assessment of the quality of the available information.
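The two uncertainty indices can be sketched as follows. The abstract does not state which definitions were used, so this sketch assumes commonly cited ones from the Dempster-Shafer literature: the Hartley-based non-specificity, N(m) = Σ m(A) log2|A|, and the Klir-Ramer discord, D(m) = -Σ m(A) log2 Σ m(B)|A∩B|/|B|. The road group names and masses are hypothetical.

```python
from math import log2


def non_specificity(m):
    """N(m): sum over focal sets A of m(A) * log2(|A|)."""
    return sum(mass * log2(len(A)) for A, mass in m.items() if mass > 0)


def discord(m):
    """D(m): -sum over A of m(A) * log2(sum over B of m(B) * |A & B| / |B|)."""
    total = 0.0
    for A, mass_a in m.items():
        if mass_a == 0:
            continue
        agreement = sum(mass_b * len(A & B) / len(B) for B, mass_b in m.items())
        total -= mass_a * log2(agreement)
    return total


# Hypothetical basic assignment for one short count: mass on single road
# groups expresses conflict, mass on a set of groups expresses imprecision.
m = {
    frozenset({"commuter"}): 0.6,
    frozenset({"recreational"}): 0.1,
    frozenset({"commuter", "recreational"}): 0.3,
}
n_value = non_specificity(m)
d_value = discord(m)
```

Under these definitions, mass spread over single groups raises discord, while mass placed on sets of several groups raises non-specificity, so the two indices separate conflicting evidence from insufficient evidence.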
The proposed approach has been implemented in a case study using traffic data from the SITRA (Sistema Informativo TRAsporti) monitoring program of the Province of Venice. In this context the approach has been validated, and the AADT estimates it produces have been compared with those obtained by two approaches proposed in previous studies. The comparative analysis shows that the proposed approach increases the accuracy of the estimates and, unlike the earlier approaches, indicates the quality of the assignment (which depends on sample size) and signals when additional data collection is needed.
The positive results obtained in the experimental phase of the research have led to the design of a software tool to be used in real-world applications in the near future.
XXIV Ciclo 198
Freeway rear-end collision risk estimation with extreme value theory approach: a case study
Current practice in crash-based safety analysis is hindered by several weaknesses: the rarity of crashes, lack of timeliness, and mistakes in crash reporting. Researchers are testing alternative approaches to safety estimation that do not require crash data. This paper presents an application of Extreme Value Theory to road safety analysis, using Time-To-Collision as a surrogate safety measure to estimate the risk of being involved in a freeway rear-end collision. The method was tested using data from an Italian toll road, with good results.
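As a reminder of the surrogate measure involved, Time-To-Collision for a car-following pair is usually computed as the gap divided by the closing speed. The sketch below assumes that common definition, with hypothetical parameter names; it does not cover the Extreme Value Theory fitting step, whose details the abstract does not give.

```python
def time_to_collision(gap_m, follower_speed_ms, leader_speed_ms):
    """Seconds until a rear-end collision if both speeds stay constant.

    gap_m is the bumper-to-bumper distance in metres; speeds are in m/s.
    Returns infinity when the follower is not closing in on the leader.
    """
    closing_speed = follower_speed_ms - leader_speed_ms
    if closing_speed <= 0:
        return float("inf")
    return gap_m / closing_speed


# Follower at 30 m/s, leader at 25 m/s, 20 m apart.
ttc = time_to_collision(20.0, 30.0, 25.0)
```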
Comparison of exhaust emissions at intersections under traffic signal versus roundabout control using an instrumented vehicle
The traditional approach to comparing alternative types of road intersection control has focused mainly on efficiency and safety. In recent years, the increasing importance of air pollution produced by vehicular traffic has suggested that environmental considerations should be added to these aspects as a criterion for intersection design. This study describes a before-and-after analysis conducted on a road intersection where a roundabout has replaced a traffic signal. Using a Portable Emission Measurement System (PEMS) installed on a test car, the instantaneous emissions of CO2, NOx and CO were measured over repeated trips along a designated route. A total of 396 trips were carried out in different traffic conditions and in opposite directions along the chosen route. Statistical methods were applied to the experimental data to investigate whether significant differences in emissions are attributable to the type of intersection control. The results indicate that replacing the traffic signal with the roundabout tends to reduce CO2 emissions, even if the differences are not always statistically significant; by contrast, the signalized intersection performs better in terms of NOx emissions. Finally, results are less clear for CO emissions, and the differences are not statistically significant in most cases.
Network Design Model with Evacuation Constraints Under Uncertainty
Abstract: Nepal earthquake, have shown the need for quick-response evacuation and assistance routes. Evacuation routes are mostly based on the capacities of the road network. However, in extreme cases such as earthquakes, road network infrastructure may be adversely affected and may not supply the required capacity. If the potential damage to critical roads can be identified in advance for various situations, it is possible to develop an evacuation model that can be used to plan the network structure so as to provide fast and safe evacuation. This paper focuses on the development of a model for the design of an optimal evacuation network that simultaneously minimizes construction costs and evacuation time. The model takes into consideration infrastructure vulnerability (as a stochastic function dependent on the event location and magnitude), the road network, transportation demand and evacuation areas. The paper presents a mathematical model for this problem. However, since an optimal solution cannot be found within a reasonable timeframe, a heuristic model is presented as well. The heuristic model is based on evolutionary algorithms, which also provide a mechanism for solving the problem as a stochastic and multi-objective problem.
On-road measurement of CO2 vehicle emissions under alternative forms of intersection control
Abstract The environmental impact of road intersection operations, and in particular of alternative types of traffic control, has received increasing attention in recent years as a factor to be considered in addition to efficiency and safety. The purpose of this study is to provide experimental evidence about this issue based on direct measurement of CO2 emissions produced by a vehicle under traffic signal versus roundabout control. Carbon Dioxide was chosen as specific target of the analysis because of its important contribution to the "greenhouse effect". Using data collected with a Portable Emission Measurement System (PEMS) installed on a test car, a before-and-after analysis was conducted on an intersection where a roundabout has replaced a traffic signal. A total of 396 trips were carried out by two drivers in different traffic conditions and in opposite directions along a designated route. Using statistical methods, the existence of significant differences in CO2 emissions in relation to the type of intersection control was investigated based on the collected data, also considering the effect of other explanatory variables and focusing in particular on peak traffic conditions. More precisely, the effect of the type of control has been characterized using descriptive statistics and permutation tests applied to the entire data set, while an analysis based on binary logistic regression has been performed with specific reference to trips carried out under peak traffic conditions. The results of these analyses support the conclusion that converting a signal-controlled intersection to a roundabout may lead to a decrease in CO2 emissions
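The permutation test mentioned above can be sketched in a few lines. This is a generic two-sided Monte Carlo test on the difference in means, not the study's actual analysis: the test statistic, the number of permutations and the emission figures are assumptions made for illustration.

```python
import random
from statistics import mean


def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Reassign the group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (hits + 1) / (n_perm + 1)


# Hypothetical CO2 readings (grams per trip) under the two control types.
signal = [212, 225, 230, 218, 240, 236]
roundabout = [198, 205, 210, 201, 215, 208]
p_value = permutation_test(signal, roundabout)
```

A permutation test makes no distributional assumption about the emissions, which suits trip-level measurements that are often skewed.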
Annual Average Daily Traffic Estimation from Seasonal Traffic Counts
This paper presents an approach to estimation of the Annual Average Daily Traffic (AADT) from a one-week seasonal traffic count (STC) of a road section, with the aim of improving the interpretability of results with measures of non-specificity and discord. The proposed method uses fuzzy set theory to represent the fuzzy boundaries of road groups and measures of uncertainty. Neural networks are used to assign a road segment to one or more predefined road groups. The approach was tested with data obtained in the Province of Venice, Italy, for the period of the year in which STCs are taken. The method was found to produce accurate results
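Fuzzy boundaries between road groups can be illustrated with the membership formula of fuzzy C-means, under which a site belongs to every group to a degree that depends on its relative distance from each group centre. The abstract does not say which fuzzy clustering algorithm was used, so fuzzy C-means, the fuzzifier value and the factor vectors below are all assumptions.

```python
import math


def fcm_memberships(point, centers, fuzzifier=2.0):
    """Fuzzy C-means membership degrees of one point in each cluster."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0 for d in dists):
        # The point coincides with one or more centres: they share the membership.
        zeros = [d == 0 for d in dists]
        return [1.0 / sum(zeros) if z else 0.0 for z in zeros]
    exponent = 2.0 / (fuzzifier - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]


# Hypothetical seasonal factors (winter, summer) for two group centres and one site.
centers = [(1.05, 0.95), (1.40, 0.70)]
site = (1.15, 0.88)
memberships = fcm_memberships(site, centers)
```

The memberships always sum to one, so a site lying between two centres is shared between the groups rather than forced into one of them.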
Analysis of drivers’ behavior in different environments: experiments with a driving simulator
Driver fatigue is a multidimensional and complex subject, addressed in past years by many researchers but not completely defined and clarified: terms such as drowsiness, sleepiness and fatigue are often used as synonyms. Studying fatigue is important because it is a contributing factor in many crashes every year.
Following a subcategorization of the fatigue concept recently proposed by May and Baldwin (2009), this paper focuses on the analysis of passive task-related (TR) effects of highway driving in monotonous environments. The analysis is based on results obtained with a driving simulator, an approach that has been widely adopted in recent years for this kind of study because it allows risky driving conditions to be analyzed in a safe and controlled environment.
Unlike previous studies, this paper analyzes the effects of a monotonous environment separately from other causal factors of the fatigued state, aiming at a better evaluation of their relative importance and of the onset of driving fatigue.
Mixed-effects models were chosen as a suitable analysis tool for the objectives of this study.
Effects of Driver Task-related Fatigue on Driving Performance
In this study, passive task-related fatigue effects on highway driving were analyzed by means of driving simulator experiments. Ten drivers were asked to drive in various environments in the morning (9:00-11:00 a.m.) and early afternoon (1:00-3:00 p.m.). Mean of Absolute Steering Error and Standard Deviation of Lateral Position, calculated on sub-intervals of 4 minutes, were analyzed as response variables. The results confirmed the negative influence of the duration of driving tasks and of circadian effects on driving performance, both of which increase the likelihood of “near misses” and accidents.
Comparison of Clustering Methods for Road Group Identification in FHWA Traffic Monitoring Approach: Effects on AADT Estimates
Defining road groups is the first step in the FHWA factor approach procedure for Annual Average Daily Traffic (AADT) estimation and is one of the main sources of errors in AADT estimates. This paper focuses on a comparative analysis of cluster analysis methods to identify road groups with similar traffic patterns according to different combinations of seasonal adjustment factors calculated for passenger vehicles and trucks. The aim is to highlight the differences among methods and input variables in the AADT estimation process, optimizing information commonly available to analysts. The analysis made use of traffic data from fifty Automatic Traffic Recorder (ATR) sites in the Province of Venice, Italy. The estimation accuracy of the clustering methods was assessed and compared by considering the values of Mean Absolute Percent Error in AADT estimates. The performance of clustering methods was found to differ, depending on datasets and traffic patterns. Particularly significant for the accuracy of AADT estimates was the choice to use seasonal adjustment factors disaggregated by vehicle type as input variables
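The accuracy measure used for the comparison, Mean Absolute Percent Error, can be computed as in the short sketch below. The function name and the AADT figures are hypothetical.

```python
def mape(actual, estimated):
    """Mean Absolute Percent Error between true and estimated AADT values."""
    errors = [abs(e - a) / a for a, e in zip(actual, estimated)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical true and estimated AADT values for three sites.
true_aadt = [12000, 8500, 20000]
estimated_aadt = [11400, 9000, 21000]
error_pct = mape(true_aadt, estimated_aadt)
```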